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Abstract 

Energy inefficiency in irreversible logic circuits is creating an 
obstruction on the path towards continued advancements in 
complexity and reductions in the size of today's computer 
systems. Designing the component circuits in a reversible 
manner may offer a possible solution to this crisis, allowing 
significant reductions in power consumption and heat 
dissipation requirements. Multi-valued (MV) reversible logic 
can provide further advantages over binary reversible logic, 
such as better performance or reducing wiring congestion. 
The current literature, however, contains very little work on 
testability of such designs. This paper describes the design of 
an online testable block for ternary reversible logic. This 
block implements most ternary logic operations and 
provides online testability for a reversible ternary network 
composed of several of these blocks. The testable block is 
composed of reversible building blocks, and thus is itself 
reversible. Multiple such blocks can be combined to 
construct complex and complete, testable, ternary reversible 
circuits. 
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Introduction 

Because of the irreversible nature of today's circuits 
information is lost as processing takes place, and as 
information is lost energy is lost in the form of heat 
dissipation. The use of reversible transformations can 
preserve energy by conserving information (Frank 
2005), and in fact as early as 1961 researchers had 
proven that reversible computing could offer a 
solution to the problem of information loss and heat 
dissipation (Landauer 1961; Bennett 1973).Testing and 
reliability of computer systems is very important; 
however, the area of fault detection in reversible 
circuits is fairly new. Recent works such as (Kole, 
Rahman, Das, and Bhattacharya 2010; Polian, Fiehn, 
Becker, and Hayes 2005) and (Zhong and Muzio 2006) 
have focused on this area. The principal motivation 
behind work on ternary online testing lies in quantum 
technologies, which can be both binary and ternary. 
However, ternary logic provides many advantages 



over binary such as high computational speed (Miller 
and Thornton 2008). Quantum logic operations are 
also reversible (Khan and Perkowski 2007), hence 
research on online reversible testability may assist the 
development of test methods for quantum circuits. 

In this paper we introduce an online testable block for 
use in the construction of ternary reversible circuits. 
We discuss the limitations of previously proposed 
designs (Rahman and Rice 2011b; Rahman and Rice 
2011a), and introduce an upgrade. Additional 
approaches are also proposed to reduce the quantum 
cost and provide comparisons. 

Background 

We begin with some background information in order 
to aid readers in their understanding of this work. 

Ternary Operations 

A multi-valued logic (MVL) is a logic system which 
utilises variables that can take on a discrete and finite 
set of values (Miller and Thornton 2008). For example, 
the binary logic system deals with values and 1, 
whereas a three-valued, or ternary system is a MVL 
that deals with values 0, 1 and 2. A finite set of values 
including two identity elements and a set of operators 
are the basic elements of the algebra for a MVL. These 
identity elements and operators are well defined over 
the finite set in one-place and two-place functions. We 
use the term one-place to refer to the functions that 
have one operand and the term two-place to refer 
functions that have two operands. The finite sets of 
logic values generally have cardinality p > 3 in a MVL 
system. 

The commonly used operators in a ternary system are 
given in Table 1. It is noted that the • symbol for the 
product operator is often omitted, as in regular 
multiplication. This practice is followed later on in e.g. 
Figure 6(A). 

Reversible Logic 

The following definitions are fundamental to the area 
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of reversible logic (Shende, Prasad, Markov, and 
Hayes 2003): 

Definition 1. A gate is reversible if the (Boolean) function 
it computes is bijective. 

Definition 2. A well-formed reversible logic circuit is an 
acyclic combinational logic circuit in which all gates are 
reversible, and interconnected without fanout. 

Table 2 defines the behaviour of some of the more 
commonly-used binary reversible gates. 

TABLE 1. SOME COMMON TERNARY OPERATORS. 



Operator 


Behaviour 


mod-sum x ffl y = (x+y) mod p 


ffl 


12 



1 
2 


12 
12 
2 1 


mod-difference x-y = (x-y) mod p 




12 



1 
2 


2 1 
10 2 
2 10 


Product x . y = (x . y) mod p 




12 



1 
2 



12 
2 1 


Successor x"*= (x +1) mod p 


X 





1 
2 


1 
2 




TABLE 2. A SELECTION OF BINARY REVERSIBLE LOGIC GATES. 



gate 


behaviour 


NOT 
Feynman 
Toffoli 
Fredkin 


(x) (xffi 1) 
(x,y) -> (x, x © y) 
(x,y,z) (x, y, xy © z) 
(x,y,z) (x, z, y) iff x=l 
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FIGURE 1. (A) THE NOT GATE, (B) THE TOFFOLI GATE, AND (C) 
A REVERSIBLE FULL ADDER. 

Binary reversible gates are cascaded together in order 
to construct binary reversible circuits. For example, 
Figure 1 illustrates a binary full adder design using 
NOT and Toffoli gates. The presence of garbage 
outputs is sometimes necessary in a reversible circuit 
to maintain the reversibility. Garbage can be defined 
as the number of unutilized outputs required to 
convert an irreversible function to a reversible one 



(Maslov and Dueck 2003). Some authors do not 
consider primary inputs (input variables) or their 
complemented forms to be garbage outputs 
(Chowdhury 2006; Khan and Perkowski 2003), and a 
similar assumption is made here. 



T 



B -\ 7 \-Q B-(J>- A©B 



A- 

B- 



A 
B 

AB a C 



(A) 

A — (x) — A 
B — ®— B 

c-fz~T- P 



(B) 



(C) 




(D) 



(E) 



FIGURE 2. (A) THE M-S GATE, (B) THE 2-INPUT FEYN- MAN 
GATE, (C) THE 3-INPUT TERNARY TOFFOLI GATE, (D) THE 
GENERALIZED TOFFOLI GATE (GTG), AND (E) THE DUAL 
SHIFT GATE. 

Ternary Reversible Logic 

We provide here descriptions of the ternary gates that 
are used in this work. A note on terminology is 
relevant here; the lines or variables operated on in MV 
reversible computing are often assumed to have some 
quantum implementation, and thus the commonly 
used term in binary quantum reversible logic for a 
variable is qubit, while the extension to ternary is 
qutrit and in general for MV logic the term used is 
qudit. 

1) 1-qutrit Permutative Gates 

Any transformation of the qutrit state can be 
represented by a 3 x 3 unitary matrix (Khan, 
Perkowski, and Khan 2004; Khan 2008). This 
transformation is known as a Z-transformation 
which shifts or permutes the input states to 
generate the desired output states. For example, the 
Z(+l) transformation shifts the input states by 1. 
There are numerous Z-transformations that can be 
defined by varying the 3 x 3 matrices, but the most 
useful transformations are shown in Table 3. There 
are five ternary one-place operations 
corresponding to the permutation of ternary values 
0, 1 and 2 and each of these can be constructed as 
ternary reversible gates. These gates are known as 
ternary 1-qutrit permutative gates. Table 3 shows 
the operations of these 1-qutrit permutative gates 
as defined by Khan (2008). Two 1-qutrit ternary 
gates act as another 1-qutrit ternary gate if they are 
cascaded together (Khan 2008). Table 4 shows the 
resultant 1-qutrit gates for two cascaded 1-qutrit 
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gates as defined by in (Khan 2008). 



TABLE 3. OPERATIONS OF 1-QUTRIT PERMUTATIVE GATES. 



Input 


Output of z-transformation 




Z(+l) 


Z(+2) 


Z(12) 


Z(01) 


Z(02) 
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TABLE 4. OPERATIONS RESULTING FROM CASCADING TWO 1-QUTRIT 
GATES. 



First 1-qutrit 
gate 
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02 
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2) M-S Gates 



The Muthukrishnan-Stroud, or M-S gate is a 2- 
qutrit MV gate which can be realized using ion- 
trap technology (Khan and Perkowski 2007). The 
M-S gate is generally represented as shown in 
Figure 2(A). The value of output Q is controlled by 
the value of the input A. Qis the Z-transformation 
of the input B whenever A = 2, where Z e 
{+1,+2,12,01,10}. If A = 2 then input B is passed 
unchanged as Q = B. The Z transforms are 
described in Table 3 (Khan and Perkowski 2007). 

3) Ternary Toffoli and Feynman Gates 

Toffoli gates and 2 * 2 Feynman gates are used 
extensively in this work. The notations used in this 
paper are shown in Figure 2. Quantum realizations 
using M-S gates to implement ternary Toffoli and 
Feynman gates are discussed by Khan and 
Perkowski (2007) as well as in a later work (Khan 
2009). 

4) Generalized Toffoli Gates (GTGs) 

In a generalized Toffoli gate (GTG), whenever the 
values of the two controlling inputs are 2 a Z- 
transformation is applied on the controlled input to 
generate the output. For all other combinations of 
the controlling inputs, the controlled input is 
passed unchanged. However, to make the 
operation more generalized, i.e. to allow the Z- 
transformation to be activated for other 
combinations of the controlling inputs (other than 2, 
2), (Khan and Perkowski 2007) proposed a more 
general version of this Toffoli gate. The symbol for 
Khan's GTG is shown in Figure 2(D), where x and 
y indicate the controlling values for each line. 



5) Dual Shift Gates 

Six 1-qutrit ternary shift gates based on GF3 
operations were proposed in (Khan 2004). In this 
work we utilise one of these, the dual shift gate. 
The dual shift gate has the functionality x^ = (x+2) 
mod 3 and is illustrated in Figure 2(E). 

6) Cost Metrics 

One simple way to compare reversible circuits is by 
counting the gates required in a circuit's 
implementation. However, gate count does not 
consider the complexity of the gates, and thus is 
often an inaccurate measure of the size of a 
reversible circuit. For example, if a circuit can be 
constructed either using five Toffoli gates with 2 
control lines, or two Toffoli gates each with 10 
control lines, the gate count metric will consider 
the second approach to be better since it uses only 
two gates to realize the circuit. However, this is 
inaccurate since a 10-bit Toffoli gate has higher 
complexity, and requires additional quantum 
operations when compared to a 2-bit Toffoli gate 
(Nayeem 2011).(Maslov and Dueck 2004) defined 
the quantum cost of a reversible gate as the number 
of quantum operations required to realize that gate. 
The M-S gate is considered to be an elementary 
quantum building block which was defined by 
Khan and Perkowski (2007) as having a cost of 1. 
This gate is used as the cost metric in this work. 
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FIGURE 3. Rl AND R2 GATES (VASUDEVAN, LALA, DI AND 
PARKERSON 2006). 

Fault Models 

Fault modeling refers to the detection and modeling of 
the behaviour of possible defects in a circuit. There are 
various traditional models of faults such as stuck-at 
faults and bridging faults (Jha and Gupta 2003), as 
well as models suggested specifically for reversible 
implementations such as missing gate faults (Polian, 
Fiehn, Becker, and Hayes 2005) and cross point faults 
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(Zhong and Muzio 2006). The fault model used in this 
work is the single-bit fault used in (Vasudevan, Lala, 
Di, and Parkerson 2006). Whenever an internal circuit 
error changes the value of an output value, a single bit 
error occurs. The error is reflected in any one of the 
output values of the block. Single bit errors are very 
similar to stuck-at faults; however, stuck-at faults 
behave independently of the inputs whereas the 
behaviour of a single bit error is dependent on the 
initial input values. The ternary online testing blocks 
proposed in this paper are designed to identify single 
bit errors propagated to one output line within the 
testable blocks. This uses a method similar to that 
proposed for the binary case by other authors 
(Vasudevan, Lala, Di, and Parkerson 2006). The errors 
are then propagated to the circuit outputs via the use 
of a ternary two-pair two-rail checker. Since we are 
working with ternary logic, a single bit error should 
technically be referred to as a single qutrit error; 
however for the sake of simplicity we use the term 
"single bit error" through-out the rest of this paper. 

Online Testability 

Online, or concurrent testability refers to the ability of 
a circuit to test a portion of itself without halting the 
operation (Vasudevan, Lala, Di, and Parkerson 2006). 
Online testing includes detecting a fault, the point of 
occurrence and sometimes, fault recovery (Jha and 
Gupta 2003). 

Related Work 

Previous work included the proposal of binary 
reversible online testable logic blocks (Vasudevan, 
Lala, Di, and Parkerson 2006). The proposed blocks Rl 
and R2 are used in pairs and cascaded together for the 
design of testable reversible logic circuits. Arbitrary 
binary functions are implemented by the Rl gate, and 
the R2 gate implements the online testability. As well 
as duplicating inputs, the R2 gate also generates the 
parity of its inputs, and generates as well as the 
complement of its input R at S only if the inputs 
remain unchanged. Thus R = S indicates a fault in the 
circuit. Figure 3 shows the proposed Rl and R2. The 
same work also proposes a rail checker circuit to 
detect and carry forward flaws in testable blocks in a 
larger circuit. 

(Mahammad and Veezhinathan 2010) proposed an 
approach to directly construct an online testable circuit 
from a given reversible circuit. The authors proposed 
two steps for this construction. In the first step, every n 



x n reversible gate G in that circuit is transformed into 
a new (n + 1) x (n + 1) Deduced Reversible Gate DRG 
(G). This can be achieved by adding an extra input bit 
Pia and the corresponding output bit Poa to the 
reversible gate G, maintaining the original 
functionality of the gate. Figure 4 shows the 
conversion. 
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FIGURE 4. G GATE AND DEDUCED REVERSIBLE LOGIC GATE 
DRG(G). 

In the second step, a testable reversible cell of G called 
TRC(G) is constructed by cascading the DRG(G) with 
a deduced identity gate. An identity gate is an n * n 
reversible gate where all the inputs are simply copied 
to the outputs. For instance if X is an n * n identity 
gate, then its deduced identity gate (DRG(X)) can be 
easily constructed from X by adding the extra input bit 
Pa and the corresponding output bit Pob. 
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FIGURE 5. CASCADED DRGS TO FORM A TRC. 



DRG(G) and DRG(X) are then cascaded by connecting 
the first n outputs of DRG(G) and the first n inputs of 
DRG(X). 

The new (n + 2) * (n + 2) block is called a Testable 
Reversible Cell, TRC (G) which is the final online 
testable gate. The construction is shown in Figure 5. 

Another work (Nayeem and Rice 2012) presented 
more current research related to online testing of 
ternary Toffoli gates. In this paper the authors 
discussed an online testing approach as applied to a 
cascade of ternary Toffoli gates. In this approach one 
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additional line (referred to as a parity line) is added to 
the circuit. Each input and output line of the circuit is 
connected to the additional parity line by one 2-qutrit 
Toffoli gate and one 2-qutrit modified Toffoli gate. 
Each n-qutrit Toffoli gate of the original circuit is 
replaced by a n + 1-qutrit Toffoli gate and connected to 
the parity line. The resultant circuit is proved to be 
online testable. 

Testable Ternary Logic Block 

In this section, we describe our design of the online 
testable ternary reversible logic block. The basic 
concept is similar to that described in (Vasudevan, 
Lala, Di, and Parkerson2006), however, we have 
extended this work to ternary and proposed different 
approaches to reduce the implementation cost. As first 
described in (Rahman and Rice 2011b) and extended 
in (Rahman and Rice 2011a), the online testable 
ternary reversible logic block (TR) is composed of two 
individual blocks referred to as TR1 and TR2, which 
are described below. 

TR1 Block 

The logic needed for the functionality of the circuit is 
implemented using the proposed TR1 block, as shown 
in Figure 6(A). All operations in this figure and 
subsequent figures are ternary operations as defined in 
Table 1. The outputs L, M and N implement the 
ternary operations and error detection is performed 
using the output Q. Q = P©A©B©C should be 
equivalent to the sum of the outputs L, M and N, and 
input P. The operations are independent of the input P 
and P is set to an arbitrary value which is used in the 
testability feature. The basic ternary operations such as 
AND, EXOR, successor, negation/complement, mod- 
sum and mod-difference can be implemented by the 
TR1 block. As an example, to find the successor — 
where x = 1, we need to set the inputs A = 1, B = 1, C = 
and P = in TR1 to produce the desired result of M = 
A e B = 2. 

TR2 Block 

The online testing is achieved by the TR2 block. Figure 
6(B) shows the functionality of the TR2 block. A single 
bit error is detected using the output S when TR2 is 
cascaded with a TR1 block to create an online testable 
block (TR). In a TR block, the TR2 block receives the 
first three outputs of the TR1 block as inputs and 
generates copies of each, along with the error 
detecting output S. If any error occurs inside the TR2 



block, it is reflected on the output S. 
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FIGURE 6. (A) THE TR1 BLOCK AND (B) THE TR2 BLOCK. 

Online Testable Block (TR) 

TR1 and TR2 are cascaded together to construct an 
online testable ternary reversible block (TR). The TR 
block formed by connecting a 4 x 4 TR1 to a 4 x 4 TR2 
by their first three outputs and inputs is a 5 x 5 block. 
In an online testable block, input P of the TR1 block 
and input R of theTR2 block must be set in such a way 
that R = P. Since we must set R to be a successor of P, 
we can set P = and R =1, P = 1 and R = 2 or P = 2 and 
R = 0. For regular operation we have chosen to set P = 
and R = 1. Figure 7(A) shows the configuration and 
Figure 7(B) shows the block diagram for the TR 
block.TRl takes ternary logic values for implementing 
the desired functionality at A, B, C, and P is set to 0. 
Output Q generates P © A © B © C where P = 0. TR1 
has been constructed so that A © B © C can be equal 
to L © M © N only if no error occurs inside TR1. TR2 
transfers the input values D, E, F to outputs U, V and 
W, where D = L, E = M and F =N. TR2 also generates 
the error detecting output at S. The output S will be 
the successor of Q if no error occurs in TR1 and TR2. If 
any error occurs, output S will no longer be a 
successor of Q. 
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FIGURE 7. ONLINE TESTABLE BLOCK TR 
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1) Limitation of the TR Block 

In ternary GF3 logic, adding 3 to a variable leaves 
the variable unchanged, e.g., A © 3 = A © 1 © 1 © 
1 = A. Let us assume that a copy of the input A is 
required. If inputs B and C are set to 1, then we 
have U = A © 1 and V = A © 1 at the outputs of a 
TR. If we repeat this process thrice using three TR 
blocks, each time providing A = A © 1 from the 
previous block and B = C = 1, at the end of the third 
operation we have U = A©1©1©1 = A and V = 
A©1 ©1 © 1 = A. Thus we have produced a 
copy of input A. In this design, five TR blocks and 
two rail checkers (RC) are required to generate a 
single copy. Each TR block requires 61 M-S gates 
and each RC requires 18 M-S gates. Therefore, the 
number of gates required for a single copy 
operation is 3 * 61 + 2 * 18 = 219, which is very 
large. This inefficiency of the TR block can be 
improved in different ways, as discussed in the 
following sections. 

Upgraded Design of the TR1, TR2 and TR Blocks 
(5TR1, 5TR2, 5TR) 

One disadvantage of the original TR block is that it 
requires three TR blocks and two rail checkers to 
generate a copy of one input, which creates a large 
amount of overhead. Hence, the 4 x 4 TR1 block that is 
part of the TR block can be changed into a 5 « 5 block 
by adding an additional constant input and an 
output O = A. The output function Q will also be 
changed. To distinguish between the 4 x 4 TR1 block 
and the 5 x 5 TR1 block in this literature, they are 
referred to as 4TR1 and 5TR1 correspondingly; 
similarly the 4x4 TR2 block is referred to as 4TR2.We 
define the new 5 x 5 TR1 block (5TR1) as shown in 
Figure 8(A) and 5 x 5TR2 block (5TR2) as shown in 
Figure 8(B). To distinguish between the 4 x 4 TR2 
block and 5 x 5 TR2 block we refer to them as 4TR2 
and 5TR2 respectively. 
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FIGURE 8. (A) 5 x 5T Rl (5TR1) BLOCK, (B) 5 * 5T R2(5TR2) BLOCK, 
AND (C) 6 x 6T R (5TR) BLOCK. 

Two-pair Two-rail Checker 

As described in (Nikolos 1998), error checking is often 
implemented using one of two main techniques: parity 
codesn or two-pair two-rail checkers. Rail checkers 
compare the outputs from more than one identical 
system. This process is also used to reduce the number 
of error detecting outputs (Nikolos 1998). The two-pair 
two-rail checker receives two pairs of inputs and 
generates a pair of outputs that indicates if the prior 
operations were fault-free or faulty. A ternary two- 
pair two-rail checker should receive two pairs of 
inputs from two TR blocks, compare them and, 
indicate in the outputs if any flaw was identified by 
either of those TR blocks. This concept has also been 
used in the reversible context by other authors such as 
in (Farazmand, Zamani, and Tahoori 2010) and 
(Vasudevan, Lala, Di, and Parkerson 2006). The 
purpose of the rail checker is to detect the existence of 
a flaw, if there is any, in the TR blocks attached to the 
rail checker. This is achieved by checking whether the 
outputs of the blocks are successors or not. The rail 
checker generates two outputs where one is successor 
to another if the attached blocks are fault-free. Since 
these outputs may be used afterwards to cascade 
additional rail checkers, successor outputs are 
generated to represent the fault-free situation. 
Otherwise, if the rail checker detects any flaw in the 
attached TR blocks, the design guarantees that the 
outputs generated will never be the successor of each 
other. The rail checker is designed in such a way that if 
the input pairs are successors, i.e. yO = x0^ and yl = 
xl~*, the rail checker will generate X3 = 1 and X4 = 2, so 
that X4 = X3~*, otherwise it generates outputs where X4 
* X3~\ There is no specific reason for choosing X3 = 1 
and X4 = 2. Any two successors could have been used, 
for example X3 = and X4 = 1 or X3 = 2 and X4 = 0. 

Example Circuit 

We implement the GFSOP expression F = ab © cd to 
demonstrate that the proposed blocks in this work can 
successfully implement a ternary GFSOP. Figure 9 
shows the realization of function F using the proposed 
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online testable ternary reversible blocks. The final 
outputs of the second rail checker can be used for 
cascading if the function needs to be further extended. 

Internal Designs 

5TR Block 

The designs of the TR1 and TR2 block have gone 
through a number of evolutions as we continue to 
improve the design. The final versions of the 5 x 5 TR1 
(5TR1) and 5 x 5 TR2 (5TR2) blocks are designed using 
a number of 2 x 2 Feynman gates and a cascade of 
GTGs. To reduce the number of M-S gates in the new 
design of 5TR1, a new gate named the Modified- 
Feynman (MF) gate has been proposed, as illustrated 
in Figure 10. 

Figure 11 shows the internal design of the 5TR1 block. 
This design includes four 2x2 Feynman gates, one MF 
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FIGURE 10. (A) INTERNAL DESIGN OF THE MODIFIED 
FEYNMAN (MF) GATE AND (B) SYMBOL FOR THE MF 
GATE. 

gate, one dual-shift gate and four GTGs. It can be 
verified from the internal structures of the 
constructing gates that the number of M-S gates 
required for each gate is as follows: 

• a Feynman gate requires 4 M-S gates, 

• the dual-shift gate requires one M-S gate, 

• the MF gate requires 4 M-S gates, and 

• each GTG requires 7 M-S gates (Khan and 
Perkowski 2007). 
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Therefore, the total number of M-S gates required to 
implement the 5TR1 block is 49 which is 55.4% less 
than the first design of the 4TR1 block discussed in 
(Rahman and Rice 2011b). However, the most 
significant achievement of this design is the number of 
garbage outputs, which is zero for this design. 



However it should be mentioned here that when this 
block is used to realize a ternary circuit, some of the 
outputs of the TR block may become garbage outputs 
if they are not used in any further operation. Figure 11 
shows the new 6 x 6 TR block consisting of a 5TR1 
cascaded with a 5TR2 and addressed as the 5TR block. 
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The total number of M-S gates required to construct 
this 5TR block is (49 + 16) = 65. 

Two-pair Two-rail Checker 

To implement the rail checker discussed above, an 
elementary gate (E) with two controlling inputs and 
one controlled input has been designed. The design of 
the E gate is based on the architecture of the 1-qutrit 
ternary comparator circuit proposed in (Khan 2008). 
Figure 12 (A) and (B) shows the block diagram of the E 
gates. The behavior of the Z-transformation depends 
on whether the second input (y) is successor to the first 
input (x) or not. If the second input is a successor of 
the first input, the Z-transformation changes the 
controlled input, as previously described, to either a 1 
or a 2. Otherwise, is passed unchanged through to K. 
There are actually two E-gate designs, one with an 
output of 1 if xO = yO^ and one with an output of 2 if 
xl = yl - * where xO, yO and xl, yl are outputs from two 
attached TR blocks. 

The controlling inputs (x, y) of the two E-gates 
comprise the four inputs of the rail checker circuit and 
the controlled outputs Ka and Kb comprise the two 
outputs of the rail checker. Each controlled input is a 
constant input set to the value zero. One each of Ea 
and Eb are used to construct the ternary rail checker. 

The first E-gate (Ea) receives xO and yO as its inputs 
and generates Ka = 1. This is generated by the Z (+1) 
transformation on the controlled input, but only if yO = 
x0~\ The second E-gate (Eb) that takes xl and yl as 
inputs generates Kb = 2 at the output by applying the 
Z(+2) transformation on the controlled input in the 
case where yl = xl^. Figure 12(C) shows the block 
diagram of the internal architecture of the ternary rail 
checker. From this figure we can see that since error 
detecting signals X3 and X4 are generated out of two 
physically separated E-gates any single bit error in 
internal lines will affect one of the two outputs, but 
not both. 

Approaches to Utilizing the Online Testable 
Block 

The benchmark circuits from (Denier, Yen, Perkowski, 
and Kerntopf 2004) can be implemented using the 
proposed 5TR block and the rail checker. However, in 
reversible logic it is often necessary to "copy" input 
values to avoid fanout. Therefore, it is possible to 
reduce the cost of implementation in terms of the 
number of gates if the copy operation is performed 
intelligently rather than using multiple 5TR blocks for 



generating copies. In the following sections, distinct 
blocks are proposed to implement the copy operation. 



xO- 



-2 



yO — i — | +i 




-2 



+2 



+2 



+2 



(A) 



x1 

yi 

o • 



-2 



+1 



-1 



*2 



+2 



^2 



+0 



+2 



3> 



xO 



x1 

yT 



(B) 



xQ 
yO 



xl 



X0 

1o 



(y="3 



+2 



X3 



X4 



(C) 

FIGURE 12. (A) THE INTERNAL STRUCTURE OF THE Ea GATE, 
(B) THE INTERNAL STRUCTURE OF THE Eb GATE, AND (C) 
THE INTERNAL ARCHITECTURE OF THE TWO-PAIR TWO- 
RAIL CHECKER CIRCUIT. 

Method 1: Combination of5TR Block and Online 
Testable Copy Gate 

This block can be used to generate copies of a single 
input but will also incorporate the online testability 
feature. The maximum number of outputs excluding 
the error checking output is four in a 5 x 5 block. 
Therefore the online testable copy gate can generate 
four copies of a single input. However, generating 
four copies involves more M-S gates as well as 
increase the number of garbage outputs if only two 
copies are required, as in most of the cases. Thus, we 
have limited our design of TRco Py to generate only 
three copies of the input, although the flexibility of 
designing a copy gate for four copies still exists. Figure 
13(A) shows the internal design of the online testable 
copy gate. For the copy operation 5TR1 is replaced 
with the new 5 x 5 TRco Py block and cascaded with a 
5TR2 block for the online testability feature. The new 
5TR block constructed from TRco Py and 5TR2 is 
referred to as TRc which generates three copies of a 
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single input. The error detection policy is identical to 
that used in both the 4TR and 5TR blocks. TRco Py 
requires 8 and 5TR2 requires 12 M-S gates. Therefore, 
a TRc block requires 24 M-S gates in total. 
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FIGURE 13. INTERNAL DESIGNS OF THE (A) 6 x 6 TRc BLOCK AND (B) 6*6 
TRmc BLOCK. 

Method 2: Combination of5TR Block, TRc and 
Multicopy Gate (TRmc) 

Copies of multiple variables are required for some 
logic operations. For example, two copies of A, two 
copies of B and two copies of C are required for the 
benchmark 3CyG2 which implements the function ab 
© be © ca. Three TRc blocks are required to generate 
the copies in this benchmark. To avoid this situation, 
the design of TRco Py can be modified to generate two 
copies of two different variables. Figure 13(B) shows 
the configuration. This multi copy gate is referred as 
TRmuiticopy. 16 M-S gates are required to realize this gate. 
Online testability is incorporated by cascading the 
TRmuiticopy and the 5TR2 block. The new cascaded block 
is referred as TRmc. Therefore, a TRmc block requires 32 
M-S gates in total to generate one copy for each of the 
two variables whereas TRc would require 48 M-S gates 
to perform the same operation. Although the number 
of M-S gates is increased in TRmc, this block can reduce 
the number of M-S gates to 33% where copies for 
multiple variables are necessary. In the 3CyG2 
benchmark function, 24 * 3 = 72 M-S gates are required 
to implement the copy functions, whereas 56 M-S 
gates are required if one TRc and one TRmc are used. In 
another benchmark circuit, 4CyG2, which implements 
ab ® be © cd © da, 24*4 = 96 M-S gates are required if 
TRc is used whereas implementation using two TRmc 
requires only 64 M-S gates. For the best result a 
combination of TRc and TRmc blocks can be used. The 
number of M-S gates required to construct each 



proposed block is shown in Table 5. This table can be 
used to determine the quantum cost of any circuit 
constructed using the proposed blocks. 

TABLE 5. NUMBER OF M-S GATES REQUIRED TO IMPLEMENT THE 
PROPOSED BLOCKS. 



Blocks 


Number of M-S gates 


4TR 


4TR1 


49 


4TR2 


12 


Total (4TR) 


61 


5TR 


5TR1 


49 


5TR2 


16 


Total (5TR) 


65 


TRc 


TRcopy 


8 


5TR2 


16 


Total (TRc) 


24 








TRmc 


TRmuiticopy 


16 


5TR2 


16 


Total (TRmc) 


32 








RC 


18 



Discussion 

Comparisons of the Approaches 

Table 6 presents the number of M-S gates, which is 
used as the cost metric in this work, required to realize 
the benchmark circuits provided by Denier, Yen, 
Perkowski, and Kerntopf (2004) using the design 
approaches discussed so far. The cost includes the 
number of M-S gates required to implement the 
testable blocks as well as the rail checkers. 

Table 6 shows that the fewest M-S gates are required 
when using the non-testable ternary gates. This is 
unsurprising because incorporating testability features 
always adds some overhead to the circuit. However 
testable circuits are far more robust and fault tolerant 
than non-testable circuits, hence the trade off between 
the overhead and testability is considered to be 
justifiable. It can be seen from Table 6 that the basic 
approach using only 5TR blocks and rail checkers 
requires more M-S gates than the other approaches for 
benchmarks 3CyG2 and a2bccG but almost equal 
numbers of gates for the remaining benchmarks. The 
approach of using TRc blocks along with 5TR blocks 
generates the best result whenever the circuit requires 
more than two copies of an input variable. The design 
of TRc can be easily upgraded to generate 4 copies of a 
single input variable, although the upgraded design 
would require 8 more M-S gates. If the function to be 
implemented consists of several variables which 
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require 4 copies each, the upgraded design would 
further reduce the required number of gates in total. 
However the proposed design is the best for functions 
consisting of variables which require at most 3 copies 
each. 

The approach using both TRc and TRmc blocks along 
with the 5TR blocks requires the lowest number of M- 
S gates to implement the benchmark circuits. This 
approach is the most efficient since it uses a best fit 
approach to use either of the TRc or TRmc block to 
achieve the best result. This approach is the most 
efficient if the design requires at most two copies of 
the input variables and also requires copies for a large 
number of variables. It is evident from Table 6 that this 
approach requires a lower number of M-S gates for 
benchmarks 2CyG2, 2CyG3 and a2bccG and an equal 
number of gates as compared to the previous 
approach for the remaining benchmarks. 

Overhead Analysis 

In this work, overhead can be approximated as the 
total number of extra elementary gates required to be 
added for the realization of the ternary online testable 
circuit as compared to the original gate count of the 
non-testable realization of the same circuit. 

We have compared our overhead with that resulting 
from the implementation of a binary online reversible 
testing approaches (Nayeem 2011). The cost metric 
used in both the binary and ternary online testing 
approaches is the number of elementary gates 
required to realize a logic function. Table 7 shows the 
comparison of average overheads among the two 
binary and the proposed ternary online testing 
approaches. 



TABLE 7. OVERHEAD COMPARISON 



Approaches 


Average 
overhead for 
testability 


(Vasudevan, Lala, Di, and Parkerson 2006) 


312.28% 
(Nayeem 2011) 


(Farazmand, Zamani, and Tahoori 2010) 


388.67% 
(Nayeem 2011) 


(Nayeem 2011; Nayeem and Rice 2011) 


4.21% 


Our approach 


688% 



Although the average overhead of our approach is 
higher than that of other approaches, the following 
should be considered. From the last column of Table 6, 
it can be seen that the overhead costs for the 
benchmarks SumG2 and SumG3 are significantly 
higher than other overheads. These costs are 
significant in increasing the average overhead of our 



approach. If the average overhead is calculated 
excluding the benchmarks SumG2 and SumG3, the 
average overhead for our approach is reduced to 308%. 
It is noted that the non-testable approaches for 
benchmarks SumG2 and SumG3 required less than 8 
gates whereas the minimum number of gates required 
for a ternary online testable design is 61. Therefore, it 
can be predicted that our approach is likely to 
minimize the overhead costs for larger Toffoli circuits. 

TABLE 6. COMPARISON OF NUMBER OF M-S GATES FOR IMPLEMENTING 
THE TERNARY BENCHMARK CIRCUITS. 



Benchmarks 


Proposed Methods 
5TR & RC 5TR,TRc& RC 
5TR,TRc,TRmc& RC 


Non- 
testable 
ternary 

gates 


Lowest 
overhead 


2CyG2 (2ab) 


65 


65 


65 


37 


164% 














3CyG2 (ab + be 

+ CO.) 


646 


523 


489 


92 


531% 














a2bccG (a2 + be 
+ c) 


480 


398 


364 


96 


379% 














ProdG2 (ab) 


65 


65 


65 


28 


217% 














ProdG3 (abc) 


148 


148 


148 


56 


250% 














SumG2(a ® 
b) 


65 


65 


65 


4 


1525% 














SumG3(affl b 
as c) 


148 


148 


148 


8 


1750% 



The approach discussed in Nayeem and Rice (2011) 
has very low quantum cost; however our design has 
some advantages over the design in that work. Their 
design is based on the fact that there should be an 
existing Toffoli gate cascade. Therefore if there is no 
existing gate cascade implementation for the function, 
or if the existing gate cascade uses a different type of 
gate, their approach can not be applied. Our design, 
however, can realize any ternary function in online 
testable form as long as the function is described as a 
TGFSOP. However as a result of our comparisons, we 
are motivated to continue our research to find ways to 
improve our technique in order to achieve compatible 
performance. 

Conclusion & Future Work 

In this paper, we present a design for an online 
testable ternary reversible logic block. Multiple such 
blocks can be used in conjunction with two-pair two- 
rail checkers to implement any ternary reversible 
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circuit. Different design approaches are suggested, 
with discussion as to how each approach might best be 
utilized. 

The fault model used in this work is a single-bit fault 
model, meaning that any single error in a block can be 
detected and will be propagated to the outputs 
through the use of the two-pair two-rail checkers. 

One of the major concerns of reversible logic synthesis 
is to keep the number of the input constants as few as 
possible. In our proposed designs, we have only one 
constant input. The other major concerns of reversible 
logic synthesis are to keep the number of garbage 
outputs and the length of gate cascades as small as 
possible. In our designs the number of garbage 
outputs is minimized by using the internal gates as 
intelligently as possible. The length of cascaded gates 
is also kept at a minimum by replacing 4TR or 
5TRblocks with TRc and TRmc blocks which are 
designated for copy operations whenever required. In 
this work we have detected and resolved the 
limitation of the design first proposed in (Rahman and 
Rice 2011b) and upgraded in (Rahman and Rice 2011a), 
as well as proposing additional blocks for the copy 
operation, which may also result in an additional 
reduction in the circuit's quantum cost. Future work 
will include further analysis as to the fault coverage 
and examination as to how the designs could be 
modified to work with other fault models. In addition, 
a synthesis process must be developed with heuristics 
for making the best choice among the alternate blocks 
in order to implement circuits with the lowest cost. 
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